Protoclusters, the progenitors of the most massive structures in
the Universe, have been identified at redshifts up to 6.6 [refs].
Besides exploring early structure formation, searching for
protoclusters at even higher redshifts is particularly useful for probing
reionization. Here we report the discovery of the protocluster
LAGER-z7OD1 at a redshift of 6.93, when the Universe was only
770 million years old and could have been experiencing rapid evolution
of the neutral hydrogen fraction in the intergalactic medium [refs].
The protocluster is identified by an overdensity of 6 times the
average galaxy density, and contains 21 narrowband-selected Lyα
galaxies, among which 16 have been spectroscopically confirmed. At
redshifts similar to or above this record, smaller protogroups with
fewer members have been reported [refs]. LAGER-z7OD1 shows an
elongated shape and consists of two sub-protoclusters, which would
have merged into one massive cluster with a present-day mass of
$3.7 \times 10^{15}$ solar masses. The total volume of the ionized bubbles
generated by its member galaxies is found to be comparable to the
volume of the protocluster itself, indicating that we are witnessing
the merging of the individual bubbles and that the intergalactic
medium within the protocluster is almost fully ionized.
LAGER-z7OD1 thus provides a unique natural laboratory to investigate the
reionization process.


Figure 1 | Redshift distribution of spectroscopically confirmed LAEs in
LAGER-z7OD1. The red histogram shows the redshift distribution of the 16
spectroscopically confirmed LAEs in LAGER-z7OD1. The black lines are the total
transmission curves of DECam-NB964 and HSC-NB973, including the full system
response from atmosphere (at an airmass of 1.2) to detector. Compared with
HSC-NB973, the transmission curve of DECam-NB964 is more like a boxcar, with
steeper wings. The sky OH emission lines are over-plotted in green.

High-redshift Lyman-α (Lyα)-emitting galaxies (LAEs) are star-forming
galaxies with strong Lyα lines, which can be effectively selected with
narrowband imaging surveys [refs]. Aiming to build a statistical sample
of LAEs at redshift ~ 7, we are carrying out a deep narrowband imaging
survey, Lyman Alpha Galaxies in the Epoch of Reionization (LAGER),
utilizing the Dark Energy Camera (DECam, with a field of view of
~ 3 deg$^2$) on the Cerro Tololo Inter-American Observatory (CTIO)
Blanco 4m Telescope and a customized narrowband filter, DECam-NB964.
The central wavelength and full width at half maximum (FWHM) of the
filter are ~ 9642 Å and 92 Å (see Fig. 1), corresponding to a redshift
range of 6.89 to 6.97 and a line-of-sight (LOS) scale of 26 cMpc.
See Methods for more details. In the LAGER COSMOS field, we obtained
47.25 hours of narrowband exposure, reaching a 5σ detection limit
of 25.2 magnitude and a Lyα sensitivity of $10^{42.65}$ erg s$^{-1}$.
Combining the deep narrowband image with the ultra-deep broadband images
from the Hyper Suprime-Cam Subaru Strategic Program (HSC SSP),
we uniformly selected 49 z ~ 7 LAEs [refs]. See Methods and the papers [refs]
(hereafter Z17 and H19, respectively) for more details about the LAE
selection.
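
The mapping from the filter bandpass to the Lyα redshift window can be checked with a few lines of Python. This is only an illustrative sketch: the rest-frame Lyα wavelength of 1215.67 Å is a standard value that is not stated in the text, and the function names are hypothetical.

```python
LYA_REST_ANGSTROM = 1215.67  # rest-frame Lyα wavelength (assumed standard value)


def lya_redshift(observed_angstrom: float) -> float:
    """Redshift at which Lyα is observed at the given wavelength."""
    return observed_angstrom / LYA_REST_ANGSTROM - 1.0


def filter_redshift_window(center: float, fwhm: float):
    """Return (z_low, z_center, z_high) for the half-maximum edges of a filter."""
    half = fwhm / 2.0
    return (
        lya_redshift(center - half),
        lya_redshift(center),
        lya_redshift(center + half),
    )


# DECam-NB964: central wavelength ~9642 Å, FWHM ~92 Å (values from the text)
z_low, z_center, z_high = filter_redshift_window(9642.0, 92.0)
print(f"{z_low:.2f} {z_center:.3f} {z_high:.2f}")
```

With the quoted filter parameters this gives a window of about 6.89 to 6.97, centred near 6.93.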


Figure 2 | Two-dimensional spatial distribution of LAEs in LAGER-z7OD1 at z ~ 7.0. a, The spatial distribution of member LAEs of LAGER-z7OD1 (a
zoomed-in view of the black dashed rectangle region in b). Red symbols mark the 21 member LAEs of LAGER-z7OD1, including 14 candidate LAEs presented by
H19 (red stars), three additional members selected with the aid of the HSC-NB973 image (red boxes), and four lower-grade photometric candidates that were
not presented in H19 but have since been spectroscopically confirmed (red triangles). All members with spectroscopic confirmations are plotted with solid
symbols, while those not yet confirmed are plotted with open ones. Three candidate LAEs from HSC-NB973 [ref] in the area which are not detected in NB964
are marked as dashed blue circles. They are likely at higher redshifts (beyond the probe of DECam-NB964) and are not considered members of LAGER-z7OD1.
The gray dashed rectangle (26.4' × 12', corresponding to ~ 66 × 30 cMpc$^2$) represents the protocluster region. Note that while all 21 red symbols (open
and solid) are considered members of LAGER-z7OD1, only red stars (open and solid, those uniformly selected by H19) are used for the overdensity analyses.
The two blue dashed squares mark the two sub-protoclusters, each with a scale of 12' × 9'. b, The spatial distribution of the 49 LAGER z ~ 7 LAEs (blue
circles) in the whole COSMOS field presented in H19. The blue shaded contours show the local number densities of LAEs to highlight the overdense regions.
Grey areas indicate the regions we masked out when we performed the analysis.

As narrowband imaging can constrain the redshift of LAEs to a
very narrow range $\Delta z < 0.1$, corresponding to a line-of-sight (LOS)
distance of 30 to 45 cMpc at z = 6 to 8, it is also a promising approach
to search for overdense structures, for example, protoclusters, in the
early Universe [ref]. Fig. 2b shows the spatial distribution (blue circles)
and number density (blue contours) of the 49 LAGER z ~ 7 LAEs in
the whole COSMOS field as presented in H19. A high number density
region (marked by the black dashed rectangle) is clearly revealed,
containing 14 uniformly selected LAEs in H19 (see Suppl. Tab. 1
for the catalog). This overdense region (LAGER-z7OD1) has a scale
of 26.4' × 12', and a three-dimensional volume of 66 × 30 × 26
cMpc$^3$. We calculate the galaxy overdensity of LAGER-z7OD1 following
$\delta_g = (n - \bar{n})/\bar{n}$, where $n$ and $\bar{n}$ are the average number
densities of LAEs in LAGER-z7OD1 and the COSMOS field, respectively.
We obtain the galaxy overdensity of LAGER-z7OD1 to be
$\delta_g = 5.11^{+2.06}_{-\ldots}$, which indicates that LAGER-z7OD1 is a heavily
overdense region compared with the average galaxy number density. See
Methods for more details. Up to now, candidate LAEs have been
selected in four LAGER fields, that is, COSMOS, CDFS, WIDE12, and
GAMA15A (Z17, H19, Wold et al. in preparation). Among them,
COSMOS is the only one showing clear overdense region(s).
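
The conversion between angular size, redshift interval and comoving scale used for the protocluster box can be sketched as follows. This is a minimal illustration, assuming a flat ΛCDM cosmology with the Planck parameters quoted in Methods, neglecting radiation, and using a simple midpoint-rule integration; the function names and the Lyα rest wavelength of 1215.67 Å are assumptions, not taken from the paper.

```python
import math

C_KM_S = 299792.458                # speed of light, km/s
H0 = 67.66                         # km/s/Mpc (Planck value quoted in Methods)
OMEGA_M, OMEGA_L = 0.3111, 0.6889  # flat LambdaCDM, radiation neglected


def e_of_z(z: float) -> float:
    """Dimensionless Hubble rate E(z) = H(z)/H0."""
    return math.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)


def comoving_distance(z: float, steps: int = 20000) -> float:
    """Line-of-sight comoving distance in cMpc (midpoint rule)."""
    dz = z / steps
    integral = sum(dz / e_of_z((i + 0.5) * dz) for i in range(steps))
    return C_KM_S / H0 * integral


def cmpc_per_arcmin(z: float) -> float:
    """Transverse comoving length subtended by one arcminute (flat universe)."""
    return comoving_distance(z) * math.radians(1.0 / 60.0)


# Redshift edges of the filter FWHM (9642 +/- 46 Å for Lyα at 1215.67 Å)
z_lo = 9596.0 / 1215.67 - 1.0
z_hi = 9688.0 / 1215.67 - 1.0

scale = cmpc_per_arcmin(6.93)
los_depth = comoving_distance(z_hi) - comoving_distance(z_lo)
print(f"{scale:.2f} cMpc/arcmin -> {26.4 * scale:.0f} x {12 * scale:.0f} cMpc, "
      f"LOS depth {los_depth:.0f} cMpc")
```

Under these assumptions the result should land close to the ~ 66 × 30 cMpc footprint and ~ 26 cMpc depth quoted in the text; small differences can arise from rounding and from the exact cosmology adopted.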

The same field was also observed with another narrowband filter,
HSC-NB973, the bandpass of which partially overlaps with that of
DECam-NB964 (see Fig. 1). A hint of overdensity around LAGER-z7OD1
is also visible among the HSC-NB973-selected LAEs [ref], but it is
not as strong as that seen in DECam-NB964. We stacked the DECam-NB964
and HSC-NB973 images to improve the depth of the narrowband
image and selected three more members of LAGER-z7OD1.
We also plot in Fig. 2a four more members, which were selected as
lower-grade candidates (compared with those presented in H19) but
were later spectroscopically confirmed (see next paragraph). Note that
these 7 additional member galaxies are only used to illustrate the
spatial profile of LAGER-z7OD1 (not to calculate the overdensity),
as they were not selected in a uniform and unbiased way. See
Methods for details.

Spectroscopic observations have been conducted to confirm the
member LAEs, measure their redshifts, and remove potential contaminants,
which may show continuum or emission lines blueward of
9600 Å. Three members have been confirmed in a previous study [ref. 4].
We carried out new spectroscopic follow-ups using the Inamori Magellan
Areal Camera and Spectrograph (IMACS) at the 6.5m Magellan I
Baade Telescope (Feb. 6-8, 2017 and Feb. 21-23, 2018), and the Low
Dispersion Survey Spectrograph 3 (LDSS3) at the 6.5m Magellan II
Clay Telescope (January 10-11 and December 29-31, 2019). The average
seeing during the observations was ~ 0.8". We carefully reduced
the observed data and ruled out foreground identifications for member
galaxies. Details of the data reduction are presented in Methods and
a dedicated spectroscopic paper in preparation (along with identifications
of LAEs outside of LAGER-z7OD1 and in other fields). In total,
we have obtained spectroscopic confirmations for 16 member LAEs
(red solid symbols in Fig. 2a). Lyα lines were not detected in three
additional members which were put on masks; however, we are unable
to rule them out, as their Lyα lines might incidentally overlap with sky
lines, or their Lyα line widths may be too broad to be detected (> 500 km
s$^{-1}$). The two- and one-dimensional spectra of the confirmed LAEs
are presented in Suppl. Fig. 1 and the redshift distribution in Fig. 1.

The scale (66 × 30 × 26 cMpc$^3$), overdensity ($\delta_g = 5.11^{+2.06}_{-\ldots}$),
and LOS velocity dispersion of spectroscopic confirmations (~ 765
km s$^{-1}$) of our protocluster LAGER-z7OD1 are similar to those of the
previously detected protoclusters at redshifts of 5.7 to 6.6 [refs] and to
simulation predictions [refs]. We estimate the total present-day cluster
mass $M_{z=0}$ of LAGER-z7OD1 following the widely used formula [refs]
$M_{z=0} = (1 + \delta_m)\,\bar{\rho}\,V$, where $V$ is the volume of the protocluster,
$\bar{\rho}$ ($3.88 \times 10^{10}\,M_\odot$ cMpc$^{-3}$) is the mean matter density of the
Universe, and $\delta_m$ the mass overdensity. $\delta_m$ is related to the observed
galaxy overdensity through $1 + b\,\delta_m = C(1 + \delta_g)$, where $b$ is the
bias parameter and $C$ the correction factor for redshift-space distortion.
For $\delta_g = 5.11$ at z ~ 7, we find $C = 0.79$ and $\delta_m = 0.87$.
The present-day mass $M_{z=0}$ of LAGER-z7OD1 is estimated to be
$3.70^{+\ldots}_{-\ldots} \times 10^{15}\,M_\odot$, comparable to the mass of the nearby Coma
cluster [ref] (~ $2 \times 10^{15}\,M_\odot$). See Methods for details.
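
As a rough cross-check of the quoted LOS velocity dispersion, the spectroscopic redshifts of the 16 confirmed members (as listed in Supplementary Table 1) can be converted into a rest-frame velocity spread. This is only a sketch: the estimator used by the authors is not stated here, so the plain sample standard deviation below is an assumption, and the function name is hypothetical.

```python
import statistics

C_KM_S = 299792.458  # speed of light, km/s

# Redshifts of the 16 confirmed members, transcribed from Supplementary Table 1
redshifts = [6.938, 6.932, 6.923, 6.900, 6.899, 6.915, 6.945, 6.920,
             6.922, 6.962, 6.936, 6.971, 6.915, 6.917, 6.953, 6.943]


def los_velocity_dispersion(z_values):
    """Rest-frame LOS velocity dispersion (km/s) from the scatter in redshift."""
    z_mean = statistics.mean(z_values)
    sigma_z = statistics.stdev(z_values)  # sample (n - 1) standard deviation
    return C_KM_S * sigma_z / (1.0 + z_mean)


z_mean = statistics.mean(redshifts)
sigma_v = los_velocity_dispersion(redshifts)
print(f"mean z = {z_mean:.3f}, sigma_v = {sigma_v:.0f} km/s")
```

With this simple estimator the dispersion comes out within a few per cent of the ~ 765 km s$^{-1}$ quoted above; a different estimator (for example a population standard deviation or a robust scale) would shift the value slightly.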

The 3D distribution of the spectroscopic confirmations is shown
in Fig. 3. LAGER-z7OD1 shows an elongated shape and consists of
two sub-protoclusters. The overdensities of the two sub-protoclusters are
$6.76^{+\ldots}_{-\ldots}$ (left) and $9.34^{+\ldots}_{-\ldots}$ (right), respectively, where the
boundaries of the two substructures are defined as the blue squares in Fig. 2.
If we treat the two substructures as isolated, their present-day masses
are expected to be $1.39^{+\ldots}_{-\ldots} \times 10^{15}\,M_\odot$ and $1.60^{+\ldots}_{-\ldots} \times 10^{15}\,M_\odot$,
respectively.

Figure 3 | 3D spatial distribution of the 16 spectroscopically confirmed LAEs in
LAGER-z7OD1 at z ~ 7.0. The red points represent the 16 spectroscopically
confirmed LAEs in LAGER-z7OD1, which are shown as the solid symbols in
Fig. 2a. The translucent spheres denote the predicted ionized bubbles of the LAEs.
An interactive version can be found at https://www.lagersurvey.org/lager-z7od1.

We further explore whether the protocluster LAGER-z7OD1
would collapse into a single cluster. Similar to previous works [refs], we
estimate the linear overdensity of LAGER-z7OD1 to be $\delta_L = 0.54$ at
z ~ 7 (Equation 18 of [ref]). As the growth of the linear perturbation $\delta_L$
is proportional to $t^{2/3}$, $\delta_L$ will exceed the threshold ($\delta_L > 1.69$)
at z = 2, where $\delta_L = 1.69$ is the critical value of the linear overdensity
of a spherical perturbation at the time it collapses [ref]. Thus, we expect
LAGER-z7OD1 to collapse into a cluster at lower redshift. The discovery
of LAGER-z7OD1 indicates that the formation of such large-scale
structure had already begun by redshift 7.0, making it an ideal
laboratory for understanding galaxy formation and large-scale structure
formation.

During the EoR, the hard UV photons that escaped from a galaxy
could ionize the IGM and generate an HII region. The HII regions could
gradually grow and merge with adjacent ones into sufficiently large
ionized bubbles [refs], which can reduce resonant scattering of Lyα
photons in the neutral IGM [ref]. The protoclusters in the EoR may lead
the production of such bubbles because of their high number density
of galaxies. On the basis of the relation between the bubble size and
Lyα luminosity in a simulation work on reionization [ref. 24], we show the
predicted bubbles in Fig. 3 and the bubble sizes in Suppl. Tab. 1. The
summed volume of the ionized bubbles of all 21 LAEs is $6.58 \times 10^{4}$
cMpc$^3$, with the 4 most luminous ones (with $L_{\rm Ly\alpha} > 2 \times 10^{43}$ erg
s$^{-1}$, i.e., LAE 1, 2, 3, 15) contributing 60.3% of the total ionized
volume. This total ionized volume is even slightly larger than the volume
of LAGER-z7OD1 ($5.15 \times 10^{4}$ cMpc$^3$). This demonstrates significant
overlaps between individual bubbles, indicating that the individual bubbles
are in the act of merging into one or two giant bubbles (see Fig. 3).
As a comparison, the total predicted bubble volume of all the 49 uniformly
selected LAEs in the COSMOS field is $12.71 \times 10^{4}$ cMpc$^3$, corresponding
to 11.1% of the total volume surveyed by DECam-NB964. See Methods
for details. Such predicted giant bubbles are large enough to be
resolved by future 21-cm programs, e.g., SKA1-Low with a resolution
of ~ 7.3 arcmin at z ~ 7.3 (ref. 22), corresponding to ~ 19 cMpc.
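
The comparison between bubble volume and protocluster volume can be illustrated with the predicted radii listed in Supplementary Table 1. This sketch covers only the 16 spectroscopically confirmed members whose rows appear in that table (so it is a partial sum, not the 21-member total quoted above), treats each bubble as an isolated sphere, and assumes the radii as transcribed here; the variable names are hypothetical.

```python
import math

# Predicted bubble radii R (cMpc) keyed by LAE ID, transcribed from
# Supplementary Table 1 (16 spectroscopically confirmed members only)
bubble_radius = {1: 14.5, 2: 11.5, 3: 13.7, 4: 8.5, 5: 8.0, 6: 7.3,
                 7: 6.1, 9: 6.4, 10: 5.8, 11: 6.4, 13: 6.4, 15: 12.1,
                 16: 7.3, 17: 7.9, 18: 7.3, 19: 6.4}


def sphere_volume(radius: float) -> float:
    """Volume of a sphere in the cube of the radius unit."""
    return 4.0 / 3.0 * math.pi * radius ** 3


confirmed_volume = sum(sphere_volume(r) for r in bubble_radius.values())
protocluster_volume = 66 * 30 * 26  # cMpc^3, the box quoted in the text

print(f"bubbles (16 confirmed): {confirmed_volume:.3g} cMpc^3")
print(f"protocluster box:       {protocluster_volume} cMpc^3")
print(f"ratio:                  {confirmed_volume / protocluster_volume:.2f}")
```

Even this partial sum is already comparable to the volume of the protocluster box, which is the sense in which the individual bubbles must overlap.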

The merged bubble (with a predicted size of > 30 cMpc) could
significantly increase the IGM transmission, and thus enhance the Lyα
visibility of member LAEs [ref]. Note that Z17 and H19 have revealed a
bright-end excess in the Lyα luminosity function in the COSMOS field,
also suggesting the existence of big ionized bubbles at z ~ 7 that
reduce the opacity of the neutral IGM around the luminous LAEs.
Meanwhile, if the Lyα transmission through the IGM has been significantly
boosted in most LAEs in LAGER-z7OD1, it may lead to larger Lyα
equivalent widths (EWs) of LAEs in the protocluster. However, the
expected larger Lyα EWs are not seen, compared with the field LAEs in
COSMOS (see Methods and Extended Data Fig. 1 for details), though
the large uncertainties in the EW measurements and the small sample
size prevent us from reaching a robust conclusion. One possibility
is that high-redshift protoclusters are highly biased regions and might
contain LAEs with physical properties deviating substantially from those
of field LAEs [ref]. It is still unclear whether the intrinsic Lyα escape
(prior to IGM scattering) in clustered LAEs is the same as that in field LAEs.

Moreover, the expected excess of close companions due to potentially
enhanced Lyα transmission, in or behind the large bubbles of the
luminous LAEs ($L_{\rm Ly\alpha} > 2 \times 10^{43}$ erg s$^{-1}$, i.e., LAE 1, 2, 3, 15), is
not seen (see Figs. 2a and 3). This is likely in part because,
while the biased dark matter halos can increase the rate of galaxy
mergers/interactions, and thus enhance the star formation in the
overdense region [ref], the feedback from the UV background may suppress
the star formation in the nearby fainter galaxies [ref].

The discovery of the protocluster LAGER-z7OD1 provides an
excellent opportunity to probe the rise and merging of ionized bubbles
around the midpoint of the EoR. Future deep and multi-band (HST, JWST,
ALMA, etc.) observations could reveal the detailed reionization
processes, e.g., through searching for undetected, Lyα-fainter galaxies
partially responsible for the ionization budget, better constraining the Lyα
line EWs and the Lyα profiles, measuring the Lyα velocity offsets
relative to their systemic redshifts, and mapping the star formation histories
of the galaxies.

METHODS

Throughout this study, we adopt the recent Planck cosmological
parameters [ref]: $\Omega_m = 0.3111$, $\Omega_\Lambda = 0.6889$ and $H_0 = 67.66$ km s$^{-1}$
Mpc$^{-1}$, where $\Omega_m$ and $\Omega_\Lambda$ are the densities of total matter and dark energy
and $H_0$ is the Hubble constant.


Lyman Alpha Galaxies in the Epoch of Reionization (LAGER)
survey: LAEs are promising probes for characterizing cosmic
reionization [refs]. We are carrying out a large-area narrowband imaging
survey, Lyman Alpha Galaxies in the Epoch of Reionization (LAGER),
to search for LAEs at z ~ 7.0, using the Dark Energy Camera (DECam)
installed at the Cerro Tololo Inter-American Observatory (CTIO). We
designed and procured a narrowband filter (DECam-NB964) for the LAGER
survey, with a central wavelength of ~ 9642 Å and FWHM of ~ 92 Å, to
avoid the atmospheric absorption and strong OH emission lines. The filter
DECam-NB964 was installed on the DECam system in December 2015.
Owing to the large FoV (~ 3 deg$^2$) and red-sensitive camera, LAGER is
one of the most efficient surveys in searching for LAEs in the EoR. See the
filter design paper [ref] for more details. We adopt a "wedding cake" observing
strategy with two deep fields aiming to discover faint LAEs and several
shallower fields aiming to discover numerous luminous LAEs. Up to now,
candidate LAEs have been selected in 4 fields, including COSMOS, CDFS,
WIDE12, and GAMA15A.


LAGER-z7OD1 member galaxies:


a. Member galaxies from H19:

The 14 LAEs from H19 were selected with the narrowband technique. This
technique is widely used in the literature and has proven effective for
finding LAEs. Briefly, the selection criteria in H19 include: (1) the
signal-to-noise ratio (S/N) of the DECam-NB964 signal is larger than 5; (2)
a DECam-NB964 excess over the underlying broadband, ensuring that the
rest-frame equivalent width (EW) of Lyα is larger than 10 Å; (3) non-detection
in bluer broadbands (we adopt the recently released HSC SSP ultra-deep
broadband images). We visually inspected each LAE candidate to remove
possible foreground galaxies and spurious objects, such as satellite trails,
cosmic rays, etc. Finally, we obtained a clean sample of 49 LAEs in
the COSMOS field, in which a clear overdense region with 14 LAEs is revealed.


b. Additional members: 

The same field was also observed with another narrowband filter,
HSC-NB973 [ref], the bandpass of which partially overlaps with that of
DECam-NB964 (see Fig. 1). A hint of overdensity around LAGER-z7OD1
is also visible among the HSC-NB973-selected LAEs [ref], but it is not as strong
as that seen in DECam-NB964. Among the 8 HSC-NB973-selected LAEs
in the area, four (LAE-1, 2, 11, 15) were detected in DECam-NB964 and
presented by H19. An additional source (LAE-20) shows a tentative signal
(S/N = 4.7) in the DECam-NB964 image (see also next paragraph), while
the remaining 3 (LAE-22, 23, 24; blue dashed circles in Fig. 2a; namely
HSC-z7LAE24, 6, 16, respectively, in [ref]) are invisible in DECam-NB964.
The latter three have not been spectroscopically observed, and are candidate
LAEs likely at slightly higher redshifts beyond the probe of NB964 (see
Fig. 1). At the current stage, we do not consider these 3 LAEs as member
galaxies of LAGER-z7OD1, as they may lie at slightly but sufficiently
higher redshifts than that of the structure.

We further stack the DECam-NB964 and HSC-NB973 images to search
for fainter LAEs located within the common volume sampled by the two filters,
and include three more candidates (LAE-7, LAE-20 and LAE-21). We also
plot in Fig. 2a four more DECam-NB964 LAEs (LAE-5, 6, 9, 10), which
were selected as lower-grade candidates (compared with those presented in
H19) but were later spectroscopically confirmed.


The overdensity in LAGER-z7OD1: We estimate the overdensity as defined
by $\delta_g = (n - \bar{n})/\bar{n}$, where $n$ and $\bar{n}$ are the average LAE number
densities in LAGER-z7OD1 and the COSMOS field, respectively. The
number density of LAEs in COSMOS is $\bar{n} \sim 0.0072^{+\ldots}_{-\ldots}$ arcmin$^{-2}$
(49 over 1.9 deg$^2$) and the number density of LAEs in LAGER-z7OD1
is $n \sim 0.0442^{+0.0152}_{-\ldots}$ arcmin$^{-2}$ (14 over 0.088 deg$^2$). The errors are
calculated based on the Poisson errors of the LAE sample size. The galaxy
overdensity of LAGER-z7OD1 is thus $\delta_g = 5.11^{+2.06}_{-\ldots}$. The LAE sample
suffers incompleteness during the detection and selection procedures (see
H19 for details). However, as the narrow- and broad-band images used
for LAE detection and selection have rather uniform depths throughout the
COSMOS field, the incompleteness is constant over the field, and thus
cancels out in the calculation of the overdensity. Note that we use only the LAE
sample in the COSMOS field from H19 for the calculation of the overdensity. The
additional members of LAGER-z7OD1 mentioned above were excluded from
such analyses as they were not uniformly selected.
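
The overdensity definition can be evaluated directly from the quoted counts and areas. This is a minimal sketch using the rounded areas given above (1.9 deg$^2$ and 0.088 deg$^2$); the function name is hypothetical, and no attempt is made to reproduce the Poisson error bars.

```python
ARCMIN2_PER_DEG2 = 3600.0


def surface_density(count: int, area_deg2: float) -> float:
    """Number of sources per square arcminute."""
    return count / (area_deg2 * ARCMIN2_PER_DEG2)


n_field = surface_density(49, 1.9)   # whole COSMOS field (H19 sample)
n_od = surface_density(14, 0.088)    # LAGER-z7OD1 rectangle
delta_g = (n_od - n_field) / n_field

print(f"n_field = {n_field:.4f} arcmin^-2")
print(f"n_od    = {n_od:.4f} arcmin^-2")
print(f"delta_g = {delta_g:.2f}")
```

With the rounded areas this gives an overdensity of about 5.2, slightly above the quoted 5.11; the small difference is presumably due to rounding of the areas.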

We adopt an enclosing rectangle as the boundary of LAGER-z7OD1
(see the gray dashed rectangle in Fig. 2a) to calculate its volume and
overdensity. However, the selection of the boundary is somewhat arbitrary and
may differ from the intrinsic shape of the protocluster. This could
introduce systematic errors in the estimates of the volume, and thus of the
overdensity and present-day mass. For instance, we do not see member LAEs in the
lower-right region of the rectangle (see Fig. 2a), but this region contributes ~ 30%
of the volume. If we exclude this void region, the overdensity would
increase to ~ 7.73. Moreover, we simply calculate the overdensity using
the line-of-sight scale (26 cMpc) probed by NB964, and the protocluster
may have more members outside that range. Nevertheless, the effect of the
boundary selection on the present-day mass estimate is moderate, as
further discussed below.
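
The effect of excluding the empty corner follows from the definition of the overdensity: the same galaxies occupy a smaller volume, so $1 + \delta_g$ scales inversely with the retained volume fraction. A short sketch (the function name is hypothetical):

```python
def overdensity_after_shrinking(delta_g: float, kept_volume_fraction: float) -> float:
    """Overdensity when the same galaxies occupy only a fraction of the volume."""
    return (1.0 + delta_g) / kept_volume_fraction - 1.0


# Removing the ~30% empty region leaves 70% of the original volume
delta_g_trimmed = overdensity_after_shrinking(5.11, 0.70)
print(round(delta_g_trimmed, 2))  # 7.73
```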

As mentioned above, the COSMOS field is unique among the four LAGER
fields in showing clear overdense region(s). The average LAE number density
in the COSMOS field could be biased by cosmic variance, and such an effect
may also affect the calculation of the overdensity. We compare the luminosity
function (LF) of LAEs in the COSMOS field with those in the other three
LAGER fields and find the LFs to agree within 1σ Poisson errors (Wold
et al. in preparation). We integrate the LFs in the luminosity range of
$10^{42.65}$ to $10^{43.65}$ erg s$^{-1}$ and find that the derived average LAE densities
from the four fields agree within 15%. Thus the field-to-field variation has
no significant effect on the calculation of the overdensity.

We finally note that we cannot rule out the possibility that a small
fraction of the 49 uniformly selected candidates from H19 are actually not
real LAEs, but contaminants (such as noise spikes in the narrowband image,
variable sources, or foreground emission line galaxies). The total number
of contaminating foreground emission line galaxies (Hα, [OIII] and [OII])
was estimated to be 0.82 in COSMOS, thanks to the ultra-deep broadband
images available [ref]. Considering that the contaminants are unlikely to be
spatially associated with the protocluster and should be distributed randomly,
excluding such contaminants (even if possible) from the calculation would yield
an even higher overdensity.


Spectroscopic observation and data reduction: The three brightest LAEs
in LAGER-z7OD1 have been spectroscopically confirmed with IMACS on
February 6-8, 2017 [ref]. We carried out spectroscopic follow-ups for more
LAEs in LAGER-z7OD1 using IMACS at the 6.5m Magellan I Baade
Telescope (February 21-23, 2018), and LDSS3 at the 6.5m Magellan II Clay
Telescope (January 10-11 and December 29-31, 2019). For the IMACS
observations, we used the f/2 camera (with a FoV of 27' diameter) and the 300-line
red-blazed grism. For LDSS3, we used the VPH-Red grism and the OG590 filter
to eliminate second-order contamination. Compared with IMACS, LDSS3
has a smaller FoV (8.4' diameter) but relatively higher efficiency at 9600
to 9700 Å. A slit width of 1" was adopted for both instruments. The spectral
reduction was performed using the COSMOS pipeline [ref], and the single-epoch
spectra are averaged with weights selected to maximize the S/N of the coadded
spectra.

The resulting 1D spectra (both IMACS and LDSS3) have a spectral
resolution of ~ 6 Å. We carefully examine the 2D spectra of all spectroscopic
targets and identify 13 sources as LAEs based on single line detections.
Including the three previous confirmations [ref. 4], we now have spectroscopic
confirmations for 16 member LAEs in LAGER-z7OD1 (red solid symbols in
Fig. 2a). The spectra of these 16 LAEs are presented in Suppl. Fig. 1.

Single line detections (no other lines, no continuum) might still be
contaminated by foreground strong emission line galaxies, e.g., Hα, [OIII], and
[OII]. Due to the limited spectral quality (and partial overlap with sky lines
for some of them), we are unable to secure the characteristic asymmetric line
profile [ref] (with a red wing) of high-z Lyα lines for many sources.
Meanwhile, while some lines are too narrow to be the [OII] doublet, for some broader
ones [OII] cannot be completely ruled out based on the line profile alone [ref].
However, the contamination rate is expected to be low. For example, a recent
spectroscopic survey of high-redshift LAEs at z ~ 5.7 reports a low
contamination rate of < 10% among its spectroscopic detections [ref]. Even if we
consider a contamination rate of 10%, our single line identifications would be
reliable for most sources. More critically and fortunately, in COSMOS
ultra-deep broadband images are available to rule out almost all low-z interlopers
among emission line galaxies, and we expect the sample of H19 to include only ~
0.14 ([OII]), 0.52 ([OIII]), and 0.16 (Hα) low-z emission line galaxies over
the whole COSMOS field [ref].

LAE-8, 12, and 14 were also spectroscopically observed, but are not
yet confirmed. The non-detections of Lyα in their spectra do not
necessarily rule them out, as their Lyα lines might incidentally overlap
with sky lines, or the velocity dispersions of their Lyα lines could be too
broad to be detected (> 500 km s$^{-1}$). Note that we also do not detect any
signals (lines or continuum) indicative of foreground sources in their spectra.
These candidates, which are spatially associated with the protocluster, are
more likely real LAEs than contaminants (such as variable sources
or noise spikes in the narrowband image). This is because the area of
LAGER-z7OD1 is only ~ 1/21 of the whole COSMOS field; thus, even if a
small fraction of the 49 candidates selected over the whole field are indeed
contaminants, we would expect no more than one of them within the area of
LAGER-z7OD1 (assuming the contaminants are randomly distributed over the
field). Therefore we opt to keep all three of them as valid candidates. We
further note that even if we were able to identify all such contaminants over
the whole field, excluding them from the calculation would
yield an even higher overdensity.


Present-day mass of LAGER-z7OD1: We estimate the total present-day
cluster mass $M_{z=0}$ of LAGER-z7OD1 following the widely used
formula [refs] $M_{z=0} = (1 + \delta_m)\,\bar{\rho}\,V$, where $V$ is the volume of the
protocluster, $\bar{\rho}$ ($3.88 \times 10^{10}\,M_\odot$ cMpc$^{-3}$) is the mean matter density of the
Universe, and $\delta_m$ is the mass overdensity. $\delta_m$ is related to the observed
galaxy overdensity through $1 + b\,\delta_m = C(1 + \delta_g)$, where $b$ is the
bias parameter and $C$ the correction factor for redshift-space distortion,
$C = 1 + f - f(1 + \delta_m)^{1/3}$ and $f = \Omega_m(z)^{4/7}$. The bias
parameter is assumed to be $b = 4.54 \pm 0.63$ (measured using z ~ 6.6
LAEs [ref]). For $\delta_g = 5.11$ at z ~ 7, we find $C = 0.79$ and $\delta_m = 0.87$.
Thus, the present-day mass $M_{z=0}$ of LAGER-z7OD1 is estimated to be
$3.70^{+\ldots}_{-\ldots} \times 10^{15}\,M_\odot$ (where the errors are derived by simulating
the fluctuations of the galaxy overdensity $\delta_g$ and bias parameter $b$). As
mentioned above, the selection of the boundary is somewhat arbitrary and
could introduce as much as 30% uncertainty in the overdensity estimate.
However, this effect is moderate when estimating the present-day mass of
LAGER-z7OD1. Decreasing the volume by 30% would yield a 17%
lower $M_{z=0}$ and increasing the volume by 30% would yield a 16% higher
$M_{z=0}$. The bias parameter might have been underestimated since we
adopted the value at z ~ 6.6; specifically, an increase of $b$ from 4.5 to
5.5 will result in a ~ 7% decrease of the estimated $M_{z=0}$.
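
Because $C$ itself depends on $\delta_m$, the relation $1 + b\,\delta_m = C(1 + \delta_g)$ has to be solved numerically. The sketch below does this by bisection and then evaluates $M_{z=0} = (1 + \delta_m)\,\bar{\rho}\,V$. It is an illustration only: the evaluation redshift (6.93), the flat ΛCDM form of $\Omega_m(z)$, the root bracket and all function names are assumptions, not details taken from the paper.

```python
OMEGA_M, OMEGA_L = 0.3111, 0.6889  # Planck values quoted in Methods


def growth_rate(z: float) -> float:
    """f = Omega_m(z)^(4/7) for flat LambdaCDM (radiation neglected)."""
    matter = OMEGA_M * (1.0 + z) ** 3
    return (matter / (matter + OMEGA_L)) ** (4.0 / 7.0)


def solve_mass_overdensity(delta_g: float, bias: float, z: float, tol: float = 1e-10):
    """Solve 1 + b*dm = C(dm)*(1 + dg), with C = 1 + f - f*(1 + dm)^(1/3)."""
    f = growth_rate(z)

    def correction(dm: float) -> float:
        return 1.0 + f - f * (1.0 + dm) ** (1.0 / 3.0)

    def residual(dm: float) -> float:
        return 1.0 + bias * dm - correction(dm) * (1.0 + delta_g)

    lo, hi = 0.0, 5.0  # residual is negative at 0 and positive at 5 for these inputs
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    dm = 0.5 * (lo + hi)
    return dm, correction(dm)


delta_m, c_factor = solve_mass_overdensity(delta_g=5.11, bias=4.54, z=6.93)
rho_mean = 3.88e10       # M_sun per cMpc^3, as quoted
volume = 66 * 30 * 26    # cMpc^3
mass = (1.0 + delta_m) * rho_mean * volume
print(f"delta_m = {delta_m:.2f}, C = {c_factor:.2f}, M = {mass:.2e} M_sun")
```

Under these assumptions the solver returns values close to, though not identical with, the quoted $C = 0.79$, $\delta_m = 0.87$ and $M_{z=0} \approx 3.7 \times 10^{15}\,M_\odot$; the remaining differences may reflect rounding or details of the authors' calculation that are not given here.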


Extended Data Figure 1 | The HSC-y − DECam-NB964 color (and Lyα EW)
distribution of LAEs in the LAGER COSMOS field. The LAEs inside
LAGER-z7OD1 are plotted in blue and the field LAEs in orange. For sources without
detection in HSC-y we simply adopt the 2σ lower limits to their HSC-y
magnitudes. Most sources with color > 2 in the plot are not detected in HSC-y. The
vertical lines mark the median colors (1.84 and 1.86, respectively). The Lyα EW
tick marks are derived from the color assuming a redshift of 6.931 (corresponding to
the center of the NB964 transmission curve).

Bubble size estimation: A previous study [ref. 24] presented a semi-numerical
simulation to investigate the relation between the bubble size and Lyα
luminosity of high-redshift LAEs in the EoR. In the simulation, the star formation
rate (SFR) was assumed to be proportional to the growth rate of the dark matter
halo, the escape fraction of ionizing photons was assumed to be 0.2, and
the ionizing photon emissivity was calculated based on the star formation
history of the galaxy using the population synthesis code STARBURST99 [ref].
Finally, the evolution of the ionized bubble and Lyα luminosity (derived from
the ionizing photons which do not escape) was obtained after calculating
the radiative transfer in the IGM. Based on the relation between the bubble
size and Lyα luminosity (at z = 8, Fig. 15 in ref. 25), we show the predicted
bubbles (as translucent spheres) in Fig. 3 and the predicted bubble sizes in
Suppl. Tab. 1 for the spectroscopically confirmed LAEs in LAGER-z7OD1.

Note that the derived bubble sizes are model dependent. Ref. 25 has assumed
a constant mean IGM density outside an HII bubble, with a clumping factor of
C = 3 to take account of the density fluctuations. While increasing
the clumpiness C would not significantly decrease the bubble size [ref], ref. 26
has pointed out that the higher IGM density near the virial radius may reduce
the predicted bubble sizes. The predicted bubble size is sensitive to the
Lyman continuum escape fraction, which was assumed to be a constant of 0.2 in
[ref]. But note that the observational results at z < 4 suggest that only a small
fraction of galaxies have a high escape fraction of > 10% [refs], and the
ionizing continuum escape fraction could be mass-dependent [ref]. Moreover, in
the model of ref. 24, the Lyα escape fraction was assumed to be a constant
of 0.6, and both the bubble size and Lyα luminosity tightly correlate with
galaxy stellar mass. However, it is known that high-redshift LAEs on average
are low-mass and young galaxies, i.e., the Lyα escape fraction is mass
and stellar age dependent (e.g. [refs]). Consequently, our LAEs could be
significantly less massive and younger than the model predictions of ref. 26,
and thus would be expected to have considerably smaller bubble sizes.

Nevertheless, considering the mean neutral hydrogen fraction at
z ~ 7.0 ($x_{\rm HI} = 0.2$ to $0.4$, H19) and the significant overdensity of
LAGER-z7OD1, it is reasonable to believe that the IGM in LAGER-z7OD1
was close to fully ionized at z ~ 7.0. However, it is still uncertain whether
the member LAEs we detected can alone produce such a giant bubble, as
their predicted bubble sizes are strongly model dependent. In the cases
discussed above, more undetected, Lyα-fainter galaxies (with lower star
formation rates and/or Lyα escape fractions) could have contributed to the
reionization around LAGER-z7OD1.


EW/color distribution: The Lyα EWs of narrowband-selected LAEs can
be well represented by the color between the narrowband and the underlying
broadband. For LAEs with spectroscopic redshifts one could derive more
precise Lyα EW measurements, after correcting for the wavelength
dependence (non-boxcar shape) of the narrowband transmission and the redshift
dependence of the continuum contribution to the narrowband photometry (see Fig.
2 of H19). As not all LAEs (particularly the field LAEs) have spectroscopic
redshifts, here we simply use the HSC-y − DECam-NB964 color as
an indicator of Lyα EW and compare the color distribution of LAEs inside
LAGER-z7OD1 with that of the field LAEs (Extended Data Fig. 1). A
Kolmogorov-Smirnov test shows no statistically significant difference between
the two samples.

Supplementary Information


Supplementary Table 1 | Properties of LAEs in LAGER-z7OD1. We list the 21 member LAEs in LAGER-z7OD1. Column 1 lists the source IDs of the 21
LAEs. Columns 2 and 3 are the coordinates. Column 4 is the Lyα photometric luminosity. Column 5 lists the redshifts inferred from the line center for spectroscopic
confirmations. Columns 6 to 8 show their AUTO magnitudes in the narrowbands DECam-NB964 and HSC-NB973, and in the underlying broadband HSC-y (2σ upper limits for
non-detections). Column 9 lists the bubble size inferred using the relation in ref. 26. Column 10 is the source ID in [ref] (hereafter H19).


ID       RA          DEC          log L_Lyα    Redshift  DECam-NB964    HSC-NB973      HSC-y          R       ID in H19
                                  (erg s^-1)             (mag)          (mag)          (mag)          (cMpc)
Spectroscopically Confirmed
LAE-1    10:02:06.0  +02:06:46.3  43.54        6.938     23.08 ± 0.06   23.77^c        25.30 ± 0.33   14.5    COSMOS-1
LAE-2    10:01:53.5  +02:04:59.8  43.33        6.932     23.22 ± 0.10   24.75^c        24.11 ± 0.12   11.5    COSMOS-3
LAE-3    10:03:10.5  +02:12:30.8  43.49        6.923     23.17 ± 0.08   ...            25.15 ± 0.33   13.7    COSMOS-2
LAE-4    10:03:32.7  +02:09:25.1  43.04        6.900     24.33 ± 0.16   ...            26.42 ± 0.70   8.5     COSMOS-10
LAE-5    10:03:30.7  +02:14:08.5  43.03        6.899     24.37 ± 0.13   ...            26.63 ± 0.58   8.0     N^a
LAE-6    10:03:28.0  +02:08:51.3  43.03        6.915     24.34 ± 0.23   ...            26.45 ± 0.34   7.3     N^a
LAE-7    10:03:05.2  +02:09:14.7  42.79        6.945     24.69 ± 0.19   24.28 ± 0.18   25.81 ± 0.35   6.1     N^b
LAE-9    10:03:16.0  +02:15:42.3  42.70        6.920     24.95 ± 0.21   ...            26.13 ± 0.45   6.4     N^a
LAE-10   10:02:42.3  +02:06:55.2  42.56        6.922     25.29 ± 0.22   ...            26.42 ± 0.28   5.8     N^a
LAE-11   10:02:39.4  +02:07:12.1  42.69        6.962     25.13 ± 0.21   24.78^c        26.84 ± 0.53   6.4     COSMOS-41
LAE-13   10:02:33.5  +02:07:09.5  42.68        6.936     25.14 ± 0.20   ...            26.85 ± 0.69   6.4     COSMOS-42
LAE-15   10:02:23.4  +02:05:04.8  43.38^d      6.971     25.04 ± 0.19   23.68^c        26.41 ± 0.34   12.1    COSMOS-49
LAE-16   10:02:32.9  +02:05:52.8  42.85        6.915     24.82 ± 0.18   ...            > 27.2         7.3     COSMOS-29
LAE-17   10:03:33.5  +02:07:19.8  42.94        6.917     24.61 ± 0.20   ...            > 27.2         7.9     COSMOS-17
LAE-18   10:03:37.3  +02:07:36.7  42.86        6.953     24.81 ± 0.18   ...            > 27.2         7.3     COSMOS-27
LAE-19   10:03:39.3  +02:07:47.2  42.69        6.943     24.91 ± 0.22   ...            25.93 ± 0.50   6.4     COSMOS-34
Not Yet Confirmed
LAE-8    10:02:09.0  +02:04:11.0  42.84        ...       24.81 ± 0.21   ...            26.8 ± 0.62    7.2     COSMOS-25
LAE-12   10:03:00.1  +02:14:49.5  42.81        ...       24.76 ± 0.20   ...            26.24 ± 0.28   7.0     COSMOS-24
LAE-14   10:02:08.3  +02:06:59.6  42.79        ...       24.86 ± 0.25   ...            26.45 ± 0.67   6.9     COSMOS-30
LAE-20   10:02:47.1  +02:10:40.1  43.05^d      ...       25.40 ± 0.24   24.52^c        26.86 ± 0.44   7.0     N^b
LAE-21   10:03:15.6  +02:18:11.3  42.87        ...       24.79 ± 0.20   25.73 ± 0.51   > 27.2         6.9     N^b


^a LAE-5, 6, 9, and 10 were not included in H19 because they were labelled as lower-grade candidates for various reasons (LAE-5: close to bad image
regions; LAE-6: noisy signal in the NB image; LAE-9: adjacent to a foreground galaxy within 3″; LAE-10: DECam-NB964 signal
below 5σ), but were later spectroscopically confirmed.

^b LAE-7, 20, and 21 are selected using the stacked image of the DECam-NB964 and HSC-NB973 images.

^c We adopt the HSC-NB973 magnitudes given in ref.

^d We adopt the Lyα luminosities given in ref. because their Lyα lines lie in the red tail of the DECam-NB964 filter and their Lyα luminosities
would be severely underestimated if derived from the DECam-NB964 magnitude.
Supplementary Figure 1 | Two- and one-dimensional spectra of the 16 confirmed LAEs in LAGER-z7OD1. In the top panel, the two-dimensional spectra (yellow is
high flux and blue is low flux) are smoothed with a Gaussian kernel of 1 pixel for better illustration. The two black dashed lines (separated by 1″ vertically) mark
the expected slit position of the LAEs in the 2D spectra. In the middle panel, the blue lines are the one-dimensional spectra and the orange lines are the noise spectra.
The grey regions represent the sky OH lines (imperfect sky line subtraction could yield artificial signals visible in the spectra). The dashed horizontal lines indicate
the zero-flux level and the black arrows mark the peaks of the identified Lyα line profiles. In the bottom panel, we plot the S/N spectra, with the dashed horizontal lines
showing zero S/N. Due to flaws in the slit, LAE-16 shows a noisy 2D spectrum. However, a clear broad red wing of the line is revealed, which falls in a skyline-free
region, and the line is considerably broader than the artificial line signals in the spectrum. Thus, we identify it as a Lyα line.


